7 research outputs found

An investigation of supervector regression for forensic voice comparison on small data

The present paper deals with an observer design for a nonlinear lateral vehicle model. The nonlinear model is represented by an exact Takagi-Sugeno (TS) model via the sector nonlinearity transformation. A proportional multiple integral observer (PMIO) based on the TS model is designed to estimate simultaneously the state vector and the unknown input (road curvature). The convergence conditions of the estimation error are expressed in LMI form using Lyapunov theory, which guarantees a bounded error. Simulations are carried out and experimental results are provided to illustrate the proposed observer.

HAL Evry

Crossref

Springer - Publisher Connector

Automatic speaker recognition using phase based features

Author: Thiruvaran Tharmarajah, Electrical Engineering & Telecommunications, Faculty of Engineering, UNSW
Publication venue: University of New South Wales. Electrical Engineering & Telecommunications
Publication date: 01/01/2009

Despite recent advances, improving the accuracy of automatic speaker recognition systems remains an important and challenging area of research. This thesis investigates two phase-based features, namely the frequency modulation (FM) feature and the group delay feature, in order to improve speaker recognition accuracy. Introducing features complementary to spectral envelope-based features is a promising approach for increasing the information content of the speaker recognition system. Although phase-based features are motivated by psychophysics and speech production considerations, they have rarely been incorporated into speaker recognition front-ends. A theory has been developed and reported in this thesis to show that the FM component can be extracted using second-order all-pole modelling, and a technique for extracting FM features using this model is proposed, producing very smooth, slowly varying FM features that are effective for speaker recognition tasks. This approach is shown herein to significantly improve speaker recognition performance over other existing FM extraction methods. A highly computationally efficient FM estimation technique is then proposed, and its computational efficiency is shown through a comparative study with other methods with respect to the trade-off between computational complexity and performance. In order to further enhance the FM-based front-end specifically for speaker recognition, optimum frequency band allocation is studied in terms of the number of sub-bands and the spacing of centre frequencies, and two new frequency band re-allocations are proposed for FM-based speaker recognition. Two group delay features are also proposed, the log compressed group delay feature and the sub-band group delay feature, to address problems in group delay caused by the zeros of the z-transform polynomial of a speech signal being close to the unit circle.
It has been shown that the combination of group delay and FM complements Mel Frequency Cepstral Coefficients (MFCCs) in speaker recognition tasks. Furthermore, the proposed FM feature is successfully utilised for automatic forensic speaker recognition, which is implemented based on the likelihood ratio framework with two-stage modelling and calibration, and shown to behave in a complementary manner to MFCCs. Notably, the FM-based system provides better calibration loss than the MFCC-based system, suggesting less ambiguity of FM information than MFCC information in an automatic forensic speaker recognition system. In order to demonstrate the effectiveness of FM features in a large-scale speaker recognition environment, an FM-based speaker recognition subsystem is developed and submitted to the NIST 2008 speaker recognition evaluation as part of the I4U submission. Post-evaluation analysis shows a 19.7% relative improvement over the traditional MFCC-based subsystem when it is augmented by the FM-based subsystem. Consistent improvements in performance are obtained when MFCC is augmented with FM in all sub-categories of NIST 2008, in three development tasks and for the NIST 2001 database, demonstrating the complementary behaviour of MFCC and FM features.

UNSWorks

The I4U system in NIST 2008 speaker recognition evaluation

Author: Bin Ma
Changhuai You
Chien-lin Huang
Donglai Zhu
Eliathamby Ambikairajah
Eng-siong Chng
Haizhou Li
Hanwu Sun
Ismo Kärkkäinen
Khe Chai Sim
Kong-aik Lee
Lirong Dai
Mohaddeseh Nosratighods
Qin Jin
Rong Tong
Tanja Schultz
Thiruvaran Tharmarajah
Vladimir Pervouchine
Wu Guo
Yijie Li
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 26/12/2009

This paper describes the performance of the I4U speaker recognition system in the NIST 2008 Speaker Recognition Evaluation. The system consists of seven subsystems, each with different cepstral features and classifiers. We describe the I4U Primary system and report on its core test results as they were submitted, which were among the best-performing submissions. The I4U effort was led by th

CiteSeerX

Crossref

The I4U Submission to the 2012 NIST Speaker Recognition Evaluation

Infoscience - École polytechnique fédérale de Lausanne

I4U Submission to NIST SRE 2012: a large-scale collaborative effort for noise-robust speaker verification

I4U is a joint entry of nine research institutes and universities across four continents to NIST SRE 2012. It started with a brief discussion during the Odyssey 2012 workshop in Singapore. An online discussion group was soon set up, providing a discussion platform for different issues surrounding NIST SRE’12. Noisy test segments, uneven multi-session training, variable enrollment duration, and the issue of open-set identification were actively discussed, leading to various solutions integrated into the I4U submission. The joint submission and several of its 17 sub-systems were among the top-performing systems. We summarize the lessons learnt from this large-scale effort.

Infoscience - École polytechnique fédérale de Lausanne

An investigation of supervector regression for forensic voice comparison on small data

Author: A Drygajlo
BC Haris
BJ Guillemin
C Zhang
CC Chang
CC Huang
Chee Cheun Huang
D Meuwly
D Ramos-Castro
DA Reynolds
DL Donoho
GS Morrison
I Naseem
J Friedman
J Gonzalez-Rodriguez
J Pelecanos
J Wright
JMK Kua
JR Deller
Julien Epps
M Li
MAT Figueiredo
N Brümmer
N Dehak
P Kenny
P Rose
RE Fan
S Cumani
S Davis
S Furui
T Kinnunen
Tharmarajah Thiruvaran
V Boominathan
WM Campbell
X Huang
Publication venue: 'Springer Science and Business Media LLC'

Crossref